Abstract. In the context of this paper, a record is an entry in a sequence of random
variables (RV's) that is larger or smaller than all previous entries. After a brief review 
of the classic theory of records, which is largely restricted to sequences of independent 
and identically distributed (i.i.d.) RV's, new results for sequences of independent RV's 
with distributions that broaden or sharpen with time are presented. In particular, 
we show that when the width of the distribution grows as a power law in time n, 
the mean number of records is asymptotically of order $\ln n$ for distributions with a
power law tail (the Fréchet class of extremal value statistics), of order $(\ln n)^2$ for
distributions of exponential type (Gumbel class), and of order $n^{1/(\nu+1)}$ for distributions
of bounded support (Weibull class), where the exponent $\nu$ describes the behaviour of
the distribution at the upper (or lower) boundary. Simulations are presented which 
indicate that, in contrast to the i.i.d. case, the sequence of record breaking events is 
correlated in such a way that the variance of the number of records is asymptotically 
smaller than the mean. 



1. Introduction 

A record is an entry in a discrete time series that is larger (upper record) or smaller (lower
record) than all previous entries. Thus, records are extreme values that are defined not
relative to a fixed threshold, but relative to all preceding events that have occurred
since the beginning of the process. Statistical data in areas like meteorology [1], [2], [3], [4],
hydrology [5], [6] and athletics [7] are naturally represented in terms of records. Records
play an important role in the public perception of issues like anthropogenic climate
change and natural disasters such as floods and earthquakes, and they are an integral
part of popular culture. Indeed, the Guinness Book of Records, first published in 1955, 
is the world's most sold copyrighted book. 



The mathematical theory of records was initiated more than 50 years ago and 
it is now a mature subfield of probability theory and statistics; see [9], [10], [11], [12] for
reviews and [13] for an elementary introduction. Most of this work has been devoted to
the case when the time series under consideration consists of independent, identically 
distributed (i.i.d.) random variables (RV's). For the following discussion, it will be 
useful to distinguish between the record times at which the current record is broken 
and replaced by a new one, and the associated record values. One of the key results of 
record theory is that the statistical properties of record times for real-valued i.i.d. RV's
are completely independent of the underlying distribution. To illustrate the origin of
this universality, we recall the basic observation that the probability $P_n$ for a record to
occur in the n'th time step (the record rate) is given by

$$P_n = \frac{1}{n} \qquad (1)$$

for i.i.d. RV's, because each of the first n entries $X_1, \ldots, X_n$, including the last, is equally
likely to be the largest or smallest. The mean number of records up to time n, $\langle R_n \rangle$, is
therefore given by the harmonic series

$$\langle R_n \rangle = \sum_{k=1}^{n} P_k = \sum_{k=1}^{n} \frac{1}{k} \approx \ln n + \gamma + O(1/n) \quad \text{for } n \to \infty, \qquad (2)$$

with $\gamma \approx 0.5772156649\ldots$. Further considerations along the same lines lead to a
remarkably complete characterization of record times, which will be briefly reviewed
below in section 2.1. The universality of record times can be exploited in statistical
tests of the i.i.d. property of a given sequence of variables, without the need for any
hypothesis about the underlying distribution [9]. By contrast, distributions of record
values fall into three distinct universality classes, which are largely analogous to the
well-known asymptotic laws of extreme value statistics for distributions with exponential-like
tails (Gumbel), bounded support (Weibull) and power law tails (Fréchet), respectively
[14], [15].
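As an illustration of (1) and (2), the following minimal sketch (not part of the original analysis) counts upper records in simulated i.i.d. sequences and compares the sample mean with the harmonic number. The function names, the seed, the sequence length and the choice of uniform RV's are assumptions made for this example; by the universality of record times, any continuous distribution should give the same statistics.

```python
import random


def count_records(xs):
    """Count upper records: entries strictly larger than all previous ones."""
    best = float("-inf")
    count = 0
    for x in xs:
        if x > best:
            best = x
            count += 1
    return count


def harmonic(n):
    """Harmonic number H_n = sum_{k=1}^n 1/k, the exact mean in eq. (2)."""
    return sum(1.0 / k for k in range(1, n + 1))


def mean_records(n, trials, rng):
    """Sample mean of the number of records in `trials` i.i.d. sequences."""
    total = 0
    for _ in range(trials):
        total += count_records(rng.random() for _ in range(n))
    return total / trials


rng = random.Random(12345)      # fixed seed, chosen arbitrarily
n = 200
estimate = mean_records(n, 5000, rng)
print(f"simulated mean: {estimate:.3f}, harmonic number: {harmonic(n):.3f}")
```

The two printed numbers should agree up to statistical fluctuations of the sample mean.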

The decay of the record rate (1) with increasing n implies that the record breaking
events form a non-stationary time series with unusual statistical properties, which
will be further discussed below in section 2.1. Record dynamics has therefore been
proposed as a paradigm for the non-stationary temporal behaviour of diverse complex
systems ranging from the low-temperature relaxation of spin glasses to the co-evolution
of biological populations [16], [17], [18]. In fact, records appear naturally in the theory of
biological adaptation, because any evolutionary innovation that successfully spreads in a
population must be a record, in the sense that it accomplishes some task encountered by
the organism in a way that is superior to all previously existing solutions. Consequently
the statistics of records and extremes has been invoked to understand the distribution of
fitness increments in adaptive processes [19], [20] as well as the timing of adaptive events
[21], [22], [23], [24], [25], [26]. In the biological context the universality of record time statistics
is particularly attractive, because genotypical fitness is a somewhat elusive notion that 
is hard to quantify in terms of explicit probability distributions. 

Surprisingly few results on record statistics are known that go beyond the standard
setting of i.i.d. RV's, and thus consider correlated and/or non-identically distributed 
RV's. In the present article we focus exclusively on the latter issue, while maintaining 
the independence among the entries in the sequence. A simple example of this type was 
introduced by Yang in an attempt to explain the frequency of occurrence of Olympic 
records, which is much higher than would be expected on the basis of the i.i.d. theory 
[27]. In his model a specified number of i.i.d. RV's become available simultaneously in
each time step, corresponding, in the athletic context, to a variable (growing) population
from which the contenders are drawn. Much of the standard theory can be extended to
this case [10], [12] (see section 2.2 for a brief review). In particular, one finds that the
record rate becomes asymptotically constant for exponentially growing populations. An
application of Yang's model to evolutionary searches in the space of genotypic sequences
can be found in [24], [25]. A second line of research has addressed the case of sequences
with a linear trend, in which the n'th entry is of the form

$$X_n = Y_n + cn \qquad (3)$$

with i.i.d. RV's $Y_n$ and $c > 0$ [28], [29], [30]. Also in this case the record rate becomes
asymptotically constant; see section 2.2 for details.

The effect of trends on the occurrence rate of records is a key issue in the ongoing 
debate about the observable consequences of global warming [1], [2], [3], [4], [5]. In this
context it has been pointed out that climate variability is presumably a more important
factor in determining the frequency of extreme events than averages [31]. It is therefore
of considerable interest to investigate the record statistics of sequences of uncorrelated
RV's in which the shape of the underlying probability distribution changes systematically
with time. To initiate such an investigation is the goal of the present paper. Throughout
we assume that the probability density $p_n(X)$ of the n'th entry $X_n$ is of the form

$$p_n(X) = \lambda_n \Pi(\lambda_n X) \qquad (4)$$

where $\Pi(X)$ is a fixed normalized distribution and the $\lambda_n$ usually have a power-law time
dependence

$$\lambda_n = \lambda_0 n^{-\alpha}, \qquad (5)$$

so that $\alpha > 0$ ($\alpha < 0$) corresponds to a broadening (sharpening) distribution.

After a brief review of a few important classic results of the theory of records
in section 2, our new results for non-identically distributed random variables will be
presented in section 3. We focus on the asymptotic behaviour of the record rate $P_n$ and
the mean number of records $\langle R_n \rangle$. Preliminary numerical results for the variance of the
number of records are reported in section 3.3, but a more complete characterization of
record times and record values is left for future work. Finally, some concluding remarks
are offered in section 4.



2. Brief survey of classic results 

Given the distributions $p_k(X)$ of the entries in a sequence of independent RV's, the
probability $P_n$ that the n'th entry is an upper record is equal to the probability that
$X_n > X_k$ for all $k < n$. Hence‡

$$P_n = \int dX_n\, p_n(X_n) \prod_{k=1}^{n-1} q_k(X_n), \qquad (6)$$

where

$$q_k(X) = \int^X dx\, p_k(x) \qquad (7)$$

is the cumulative distribution of $X_k$. Similarly the probability that $X_n$ is a lower record
reads

$$P_n^* = \int dX_n\, p_n(X_n) \prod_{k=1}^{n-1} [1 - q_k(X_n)]. \qquad (8)$$

Equations (6) and (8) form the basis for most of what follows.

‡ Here and in the following, limits of integration are omitted whenever the domain of integration is
understood to comprise the entire support of the probability distribution.

2.1. Records from i.i.d. random variables 

For i.i.d. RV's the integral (6) can be performed by noting that $p_k \equiv p$, $q_k \equiv q$ and
$dq = p\, dX$, which yields the universal result (1). To arrive at a characterization of the
record time process beyond the mean number of records $\langle R_n \rangle$ we introduce the record
indicator variables $I_n$, which take the value $I_n = 1$ iff $X_n$ is a record, and $I_n = 0$ else. It
turns out that the $I_n$ are independent [HI [H] , and hence they form a Bernoulli process
with success probability $P_n$. To see why this is so, consider the two-point correlation
function $\langle I_i I_j \rangle$ and assume that $j > i$. Then the key idea is that the right hand side of

$$\langle I_i I_j \rangle = \mathrm{Prob}[X_i = \max(X_1, \ldots, X_i) \text{ and } X_j = \max(X_1, \ldots, X_j)] \qquad (9)$$

can be split into independent events according to

$$\langle I_i I_j \rangle = \mathrm{Prob}[X_i = \max(X_1, \ldots, X_i)] \times \mathrm{Prob}[X_j = \max(X_{i+1}, \ldots, X_j)] \times \mathrm{Prob}[\max(X_1, \ldots, X_i) < \max(X_{i+1}, \ldots, X_j)]. \qquad (10)$$

Following the symmetry argument used to derive (1), the first two factors are $1/i$ and
$1/(j-i)$, respectively, and the third factor can be written as

$$\mathrm{Prob}[\max(X_1, \ldots, X_j) \text{ occurs in } \{X_{i+1}, \ldots, X_j\}] = \frac{j-i}{j}. \qquad (11)$$

We conclude that

$$\langle I_i I_j \rangle = \frac{1}{i}\, \frac{1}{j-i}\, \frac{j-i}{j} = \frac{1}{i}\, \frac{1}{j} = P_i P_j = \langle I_i \rangle \langle I_j \rangle. \qquad (12)$$

Higher order correlations can be shown to factorize in the same way. The number
of records up to time n can then be expressed in terms of the indicator variables as

$$R_n = \sum_{k=1}^{n} I_k, \qquad (13)$$

and the variance of $R_n$ is

$$\langle (R_n - \langle R_n \rangle)^2 \rangle = \sum_{k=1}^{n} (P_k - P_k^2) = \sum_{k=1}^{n} \left[ \frac{1}{k} - \frac{1}{k^2} \right] \approx \ln n + \gamma - \frac{\pi^2}{6} + O(1/n) \qquad (14)$$

for $n \to \infty$. The index of dispersion $\rho_n$ of the record time process [19], [32], defined as
the ratio of the variance to the mean

$$\rho_n = \frac{\langle (R_n - \langle R_n \rangle)^2 \rangle}{\langle R_n \rangle}, \qquad (15)$$

thus tends to unity, and the distribution of the $R_n$ becomes Poissonian with mean $\ln n$
for large n. The record times form a log-Poisson process [16], [17], [22].
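Because the indicator variables are independent Bernoulli variables with success probability 1/k, the mean (2) and the variance (14) can also be summed exactly for finite n. The sketch below does this and prints the exact values next to the asymptotic expressions; the function name and the chosen values of n are assumptions made for illustration only.

```python
import math

EULER_GAMMA = 0.5772156649


def record_count_moments(n):
    """Exact mean and variance of R_n for i.i.d. entries.

    Uses independent Bernoulli indicators with success probability P_k = 1/k.
    """
    mean = sum(1.0 / k for k in range(1, n + 1))
    variance = sum(1.0 / k - 1.0 / k**2 for k in range(1, n + 1))
    return mean, variance


for n in (10, 10**3, 10**6):
    mean, variance = record_count_moments(n)
    asymptotic_mean = math.log(n) + EULER_GAMMA
    asymptotic_var = asymptotic_mean - math.pi**2 / 6
    # columns: n, exact mean, asymptotic mean, exact variance,
    # asymptotic variance, index of dispersion (15)
    print(n, round(mean, 4), round(asymptotic_mean, 4),
          round(variance, 4), round(asymptotic_var, 4),
          round(variance / mean, 4))
```

Since the variance and the mean differ by a constant close to $\pi^2/6$ while both grow only as $\ln n$, the index of dispersion approaches unity logarithmically slowly.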

A second useful observation concerns the ratios between consecutive record times. 
Let $t_m$ denote the time of the m'th record, with $t_1 = 1$ by convention. Repeating
the symmetry argument used to derive (1), we expect that given $t_m$, the preceding
$(m-1)$'th record occurs with equal probability anywhere in the interval $[1, t_m]$. This
is not quite correct, because the previous $m-2$ records also have to be accommodated,
but since $m \sim \ln(t_m)$ this is a small correction which can be neglected for large m. It is
therefore plausible (and can be proved [33]) that the ratio $t_{m-1}/t_m$ tends to a uniformly
distributed RV $U_m \in [0, 1]$ for large m. Moreover the $U_m$ become independent in this
limit [34]. This allows us to highlight a peculiar property of the sequence of record
breaking events: The expected value of $t_{m-1}$, given $t_m$, is

$$\langle t_{m-1} | t_m \rangle = t_m \int_0^1 du\, u = \frac{1}{2} t_m, \qquad (16)$$

but the reverse conditioning yields an infinite expectation, because

$$\langle t_m | t_{m-1} \rangle = t_{m-1} \int_0^1 du\, u^{-1} = \infty. \qquad (17)$$

In this sense, the occurrence of records can be predicted only with hindsight, but not 
forward in time. 



2.2. Growing and improving populations 

In the model for growing populations introduced by Yang [27] and elaborated by 
Nevzorov [10], a number $N_n$ of i.i.d. RV's becomes available simultaneously at time
n. The symmetry argument in section 1 is easily extended to this case: Because of the
i.i.d. property, the probability $P_n$ that there is a record among the $N_n$ newly generated
RV's is equal to the ratio of $N_n$ to the total number of RV's that have appeared up to
time n, and hence

$$P_n = \frac{N_n}{\sum_{k=1}^{n} N_k}. \qquad (18)$$

The independence of the record indicator variables $I_n$ introduced above in section 2.1
continues to hold [10], [25], so again the sequence of record breaking events is a Bernoulli
process with success probability $P_n$.

To give a simple example for the consequences of (18), suppose the $N_n$ grow
exponentially as $a^n$ with $a > 1$. This could model a sequence of athletic competitions
in an exponentially growing population, where each athlete is assumed to be able to
participate only in one event [27]. Then the evaluation of (18) yields

$$P_n = \frac{a^n (a-1)}{a (a^n - 1)} \to \frac{a-1}{a} \quad \text{for } n \to \infty, \qquad (19)$$

and the distribution of inter-record times $t_m - t_{m-1}$ is geometric. In his analysis of
Olympic records Yang estimated a growth factor of $a \approx 1.08$ for the four-year period
between two games, and concluded that this growth rate was insufficient to explain the
observed high frequency of records.
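The record rate (18) is straightforward to evaluate for any sequence of batch sizes. The following sketch does so for exponentially growing batches with the growth factor quoted above and compares the result with the limit in (19); the function name and the number of time steps are assumptions chosen for the example.

```python
def record_rate_growing(sizes):
    """Record rate P_n = N_n / (N_1 + ... + N_n) for batch sizes N_1, ..., N_n."""
    rates = []
    total = 0.0
    for size in sizes:
        total += size
        rates.append(size / total)
    return rates


a = 1.08                                   # growth factor per time step
rates = record_rate_growing([a**k for k in range(1, 201)])
print("P_1   =", rates[0])
print("P_200 =", rates[-1])
print("limit (a - 1) / a =", (a - 1) / a)
```

With constant batch sizes the same function reproduces the i.i.d. result (1).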

Motivated by this outcome, Ballerini and Resnick [28] considered a model of 
improving populations, where the sequence of RV's displays a linear drift according 
to (3). They showed that the record rate tends to an asymptotic limit $P(c)$ given by

$$P(c) = \lim_{n \to \infty} P_n = \int dy\, p(y)\, G_\infty(y), \qquad (20)$$

where $p(Y)$ is the probability density of the i.i.d. RV's in (3) and

$$G_\infty(y) = \lim_{n \to \infty} \mathrm{Prob}[Y_k - ck < y \text{ for all } k = 1, \ldots, n-1] = \lim_{n \to \infty} \prod_{k=1}^{n-1} q(y + ck), \qquad (21)$$

with $q(Y) = \int^Y dz\, p(z)$. The function $P(c)$ has the obvious limits $P(0) = 0$
and $\lim_{c \to \infty} P(c) = 1$, but the explicit evaluation is generally difficult. A simple
expression is obtained when $q(Y)$ is of Gumbel form, $q(Y) = \exp[-e^{-Y}]$, which yields
$P(c) = 1 - e^{-c}$. For further details on the model (3) and applications to athletic data
we refer to [28], [29], [30]. Results for specific distributions and an application to global
warming can be found in [1].
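For the Gumbel form $q(Y) = \exp[-e^{-Y}]$ the limit (20) can be checked by direct numerical quadrature, which should reproduce the closed form $1 - e^{-c}$. The sketch below is only an illustration: the function name, the integration window, the number of grid points and the truncation of the infinite product in (21) are assumptions of this example.

```python
import math


def drift_record_rate(c, terms=200, y_min=-10.0, y_max=40.0, steps=20000):
    """Midpoint-rule estimate of P(c) in eq. (20) for q(y) = exp(-exp(-y))."""
    # ln of the product in (21): sum_k ln q(y + c k) = -exp(-y) * sum_k exp(-c k),
    # with the sum over k truncated after `terms` terms.
    geometric_sum = sum(math.exp(-c * k) for k in range(1, terms + 1))
    h = (y_max - y_min) / steps
    total = 0.0
    for i in range(steps):
        y = y_min + (i + 0.5) * h
        e = math.exp(-y)
        density = e * math.exp(-e)                 # p(y) = dq/dy
        g_infinity = math.exp(-e * geometric_sum)  # truncated product in (21)
        total += density * g_infinity
    return total * h


for c in (0.5, 1.0, 2.0):
    print(c, drift_record_rate(c), 1.0 - math.exp(-c))
```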



3. Records in sequences with increasing or decreasing variance 

In this section we want to evaluate the record rates (6) and (8) for distributions of the
general form (4). Introducing the cumulative distribution corresponding to $\Pi(X)$,

$$Q(X) = \int^X dx\, \Pi(x), \qquad (22)$$

the record rates of interest can be written as

$$P_n = \int dz\, \Pi(z) \prod_{k=1}^{n-1} Q(z \lambda_k / \lambda_n) \qquad (23)$$

and

$$P_n^* = \int dz\, \Pi(z) \prod_{k=1}^{n-1} [1 - Q(z \lambda_k / \lambda_n)], \qquad (24)$$

which makes clear the obvious fact that the overall scale of the $\lambda_n$'s is without
importance.



3.1. Simple cases 

In some special cases the record rates can be evaluated exactly for arbitrary choices of 
the $\lambda_n$'s. For example, for the exponential distribution

$$\Pi(X) = e^{-X}, \quad X > 0 \qquad (25)$$

we have $Q(X) = 1 - e^{-X}$, and the evaluation of the lower record rate yields

$$P_n^* = \frac{\lambda_n}{\sum_{k=1}^{n} \lambda_k}. \qquad (26)$$

Inserting the power law behaviour (5) we see that the denominator converges to the
Riemann zeta function $\zeta(\alpha)$ for $\alpha > 1$, so that $P_n^* \to n^{-\alpha}/\zeta(\alpha)$ for large n, and the
expected number of lower records

$$\langle R_n^* \rangle = \sum_{k=1}^{n} P_k^* \qquad (27)$$

remains finite for $n \to \infty$. For $\alpha < 1$ we have instead that $P_n^* \approx (1 - \alpha)/n$ for large n,
and hence

$$\langle R_n^* \rangle \approx (1 - \alpha) \ln n \qquad (28)$$

asymptotically. As would be intuitively expected, the occurrence of lower records
is enhanced for sharpening distributions ($\alpha < 0$) and suppressed for broadening
distributions ($\alpha > 0$). Finally, in the borderline case $\alpha = 1$ we find

$$\langle R_n^* \rangle \approx \ln(\ln(n)), \qquad (29)$$

which is our first example of a nontrivial asymptotic law that differs qualitatively from
the i.i.d. result (2).
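The exact lower record rate (26) is easy to tabulate for the power law (5). The sketch below checks that $n P_n^*$ approaches $1 - \alpha$ for $\alpha < 1$, and that the rate is normalized by the zeta function for $\alpha > 1$. The function name and the values of n and $\alpha$ are assumptions made for this illustration; $\lambda_0$ is set to one because it cancels in (26).

```python
import math


def lower_record_rates(n_max, alpha):
    """Lower record rates P*_n = lambda_n / sum_{k<=n} lambda_k, eq. (26),
    for lambda_k = k**(-alpha), eq. (5) with lambda_0 = 1."""
    rates = []
    total = 0.0
    for n in range(1, n_max + 1):
        lam = n ** (-alpha)
        total += lam
        rates.append(lam / total)
    return rates


n_max = 10**5
for alpha in (-1.0, 0.0, 0.5):
    rates = lower_record_rates(n_max, alpha)
    # n * P*_n should be close to 1 - alpha; the ratio in the last column
    # approaches 1 - alpha only slowly, cf. (28)
    print(alpha, n_max * rates[-1], sum(rates) / math.log(n_max))

# alpha > 1: n**alpha * P*_n tends to 1 / zeta(alpha); zeta(2) = pi**2 / 6
rates = lower_record_rates(10**4, 2.0)
print(rates[-1] * (10**4) ** 2, 6.0 / math.pi**2, sum(rates))
```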

A simple explicit expression for the upper record rate $P_n$ can be obtained for the
uniform distribution characterized by

$$Q(X) = X \quad \text{for } 0 < X < 1 \qquad (30)$$

when the $\lambda_n$ are increasing, in the sense that $\lambda_k/\lambda_n < 1$ for all $k < n$, i.e. for the case
of a sharpening uniform distribution. Then the arguments of $Q$ on the right hand side
of (23) are all less than unity, and direct integration yields

$$P_n = \frac{1}{n} \prod_{k=1}^{n-1} \frac{\lambda_k}{\lambda_n}. \qquad (31)$$

Inserting the power law form (5) with $\alpha < 0$ one finds that the record rate decays
exponentially as $P_n \sim e^{\alpha n}$, and hence the asymptotic number of records is finite for all
$\alpha < 0$.
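The exponential decay can be made explicit by evaluating the logarithm of (31) for the power law (5), where the product reduces to a ratio involving a factorial. The sketch below is an illustration under that assumption; the function name and the test values are hypothetical.

```python
import math


def log_upper_rate_uniform(n, alpha):
    """ln P_n from eq. (31) for lambda_k = k**(-alpha) with alpha < 0."""
    # prod_{k<n} lambda_k / lambda_n = prod_{k<n} (n / k)**alpha
    #                                = (n**(n - 1) / (n - 1)!)**alpha
    log_product = alpha * ((n - 1) * math.log(n) - math.lgamma(n))
    return -math.log(n) + log_product


alpha = -1.0
for n in (10, 100, 10**4):
    # (ln P_n) / n should approach alpha for large n
    print(n, log_upper_rate_uniform(n, alpha) / n)
```

For n = 2 and $\alpha = -1$ the function returns $\ln(1/4)$, which agrees with the direct calculation of the probability that $U_2/2 > U_1$ for two uniform RV's.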



3.2. Asymptotics of the mean number of records 

In this section we focus on broadening distributions, $\alpha > 0$, and evaluate the upper
record rate (23) asymptotically for representatives of all three universality classes of
extreme value statistics. The starting point is to replace the product on the right hand
side of (23) by the exponential of a sum of logarithms, and to replace the latter by an
integral. It then follows that the asymptotic behaviour of the record rate is given by

$$P_n \approx \int dz\, \Pi(z)\, e^{n g_\alpha(z)} = \int_0^1 dQ\, e^{n g_\alpha(z(Q))}. \qquad (32)$$

The second representation will prove to be useful in the final evaluation of $P_n$. Note
that z can always be expressed in terms of Q because $dQ/dz = \Pi > 0$. The function $g_\alpha$
is given by

$$g_\alpha(z) = \int_0^1 du\, \ln Q(z/u^\alpha) \approx -\int_0^1 du\, (1 - Q(z/u^\alpha)), \qquad (33)$$

where in the second step it has been used that the integral in (32) is dominated for large
n by the region where $g_\alpha \approx 0$ and $Q \approx 1$. It is therefore clear that the asymptotic
behaviour of the record rate depends only on the tail of Q, and hence universality in
the sense of standard extreme value statistics should apply.
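Written out, the two replacements described above read as follows. This is only a restatement of the steps in the text, using $\lambda_k/\lambda_n = (n/k)^\alpha$ from (5) and the continuum variable $u = k/n$:

```latex
\prod_{k=1}^{n-1} Q\!\left(z\,\frac{\lambda_k}{\lambda_n}\right)
  = \exp\!\left[\sum_{k=1}^{n-1} \ln Q\!\left(z\,(n/k)^{\alpha}\right)\right]
  \approx \exp\!\left[n \int_0^1 du\, \ln Q\!\left(z/u^{\alpha}\right)\right]
  = e^{n g_\alpha(z)} .
```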

3.2.1. Fréchet class. The evaluation of (33) is straightforward for the Fréchet class of
distributions with power law tails. We set

$$Q(X) = 1 - X^{-\mu}, \quad X > 1 \qquad (34)$$

and obtain

$$g_\alpha(z) \approx -(1 + \alpha\mu)^{-1} z^{-\mu} = -(1 + \alpha\mu)^{-1} (1 - Q(z)). \qquad (35)$$

Inserting this into (32) yields

$$P_n \approx \int_0^1 dQ\, e^{-n (1 + \alpha\mu)^{-1} (1 - Q)} \approx \frac{1 + \alpha\mu}{n} \qquad (36)$$

for large n, and hence

$$\langle R_n \rangle \approx (1 + \alpha\mu) \ln n. \qquad (37)$$

This result remains valid for negative $\alpha$ as long as $\alpha\mu > -1$. When $\alpha\mu < -1$ the
evaluation of (23) shows that $P_n \sim n^{\alpha\mu}$ and thus the asymptotic number of records
remains finite.
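The asymptotic result (36) can be compared with a direct numerical evaluation of the exact record rate (23) at finite n. The sketch below uses a simple midpoint rule in the variable $s = 1 - Q(z)$; the function name, the cutoff of the integral, the grid size and the parameter values are assumptions of this example, and it is restricted to $\alpha \geq 0$.

```python
import math


def frechet_record_rate(n, alpha, mu, t_max=60.0, steps=2000):
    """Midpoint-rule estimate of the upper record rate (23) for Q(X) = 1 - X**(-mu).

    Assumes alpha >= 0. With s = 1 - Q(z) = z**(-mu), each factor in (23)
    becomes Q(z * lambda_k / lambda_n) = 1 - s * (k / n)**(alpha * mu).
    """
    beta = alpha * mu
    weights = [(k / n) ** beta for k in range(1, n)]
    # The integrand decays roughly like exp(-s * n / (1 + beta)); cut the
    # s-integral off after t_max decay lengths, but never beyond s = 1.
    upper = min(1.0, (1.0 + beta) * t_max / n)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += math.exp(sum(math.log1p(-s * w) for w in weights))
    return total * h


n = 200
for alpha, mu in ((0.0, 1.0), (1.0, 1.0), (0.5, 3.0)):
    rate = frechet_record_rate(n, alpha, mu)
    print(alpha, mu, n * rate, 1.0 + alpha * mu)   # n * P_n versus 1 + alpha * mu
```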



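3.2.2. Gumbel class. The Gumbel class comprises unbounded distributions whose tail
decays faster than a power law [14], [15]. A typical representative is the exponential
distribution (25) with $Q(X) = 1 - e^{-X}$. Evaluation of (33) yields

$$g_\alpha(z) \approx -\frac{z^{1/\alpha}}{\alpha} \int_z^\infty dv\, v^{-(1 + 1/\alpha)} e^{-v} = -\frac{z^{1/\alpha}}{\alpha}\, \Gamma(-1/\alpha, z), \qquad (38)$$

where $\Gamma(-1/\alpha, z)$ denotes the incomplete gamma function. For large z we have

$$\Gamma(-1/\alpha, z) \approx z^{-(1 + 1/\alpha)} e^{-z}, \qquad (39)$$

so that

$$g_\alpha(z) \approx -\frac{e^{-z}}{\alpha z} = \frac{1 - Q(z)}{\alpha \ln(1 - Q(z))}, \qquad (40)$$

which yields

$$P_n \approx \int_0^1 dQ\, \exp\left[ \frac{n (1 - Q)}{\alpha \ln(1 - Q)} \right] = \int_0^1 dv\, \exp[-n v / (\alpha \ln(1/v))]. \qquad (41)$$

To further evaluate the integral we substitute $w = (n/\ln n)\, v$ and obtain

$$P_n \approx \frac{\ln n}{n} \int_0^{n/\ln n} dw\, \exp\left[ -\frac{w \ln n}{\alpha (\ln n - \ln(\ln n) - \ln w)} \right] \to \frac{\ln n}{n} \int_0^\infty dw\, e^{-w/\alpha} = \frac{\alpha \ln n}{n} \qquad (42)$$

for $n \to \infty$. Correspondingly the mean number of records grows as

$$\langle R_n \rangle \approx \frac{\alpha}{2} (\ln n)^2. \qquad (43)$$

A second important representative of the Gumbel class is the Gaussian (normal)
distribution, for which

$$Q(X) \approx 1 - \frac{e^{-X^2/2}}{\sqrt{2\pi}\, X} \quad \text{for } X \to \infty. \qquad (44)$$

Proceeding as before, we find

$$g_\alpha(z) \approx -\frac{e^{-z^2/2}}{\sqrt{2\pi}\, \alpha z^3} \approx -\frac{1 - Q(z)}{\alpha z^2} \approx \frac{1 - Q(z)}{2\alpha \ln(1 - Q(z))}, \qquad (45)$$

which becomes identical to (40) upon replacing $\alpha$ by $2\alpha$. We conclude that
$P_n \approx (2\alpha \ln n)/n$ and $\langle R_n \rangle \approx \alpha (\ln n)^2$ for the Gaussian case. Although this does not constitute
a strict proof, it strongly indicates that the behaviour $\langle R_n \rangle \sim (\ln n)^2$ is universal within
this class of probability distributions.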
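For the exponential distribution the exact record rate (23) can be evaluated numerically at finite n and set against the asymptotic form (42). The sketch below is a rough illustration with a plain midpoint rule; the function name, the grid size and the value of n are assumptions of this example. Agreement with $\alpha \ln n / n$ should not be expected at such small n, since the last step in (42) neglects $\ln(\ln n)$ relative to $\ln n$.

```python
import math


def exponential_record_rate(n, alpha, steps=4000):
    """Midpoint-rule estimate of eq. (23) for the exponential distribution (25)."""
    # With s = exp(-z) = 1 - Q(z) one has 1 - Q(z * (n/k)**alpha) = s**((n/k)**alpha),
    # so that P_n = int_0^1 ds prod_{k<n} (1 - s**((n/k)**alpha)).
    exponents = [(n / k) ** alpha for k in range(1, n)]
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        product = 1.0
        for e in exponents:
            product *= 1.0 - s ** e
        total += product
    return total * h


n = 100
for alpha in (0.0, 0.5, 1.0):
    rate = exponential_record_rate(n, alpha)
    # n * P_n (equal to 1 in the i.i.d. case) next to the asymptotic alpha * ln n
    print(alpha, n * rate, alpha * math.log(n))
```

For $\alpha = 0$ the integrand reduces to $(1 - s)^{n-1}$ and the i.i.d. result (1) is recovered, which provides a check of the quadrature.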

3.2.3. Weibull class. As a representative of the Weibull class of distributions with finite
support we first consider the uniform distribution (30). The integral on the right hand
side of (33) can then be evaluated without approximating $\ln Q$ by $-(1 - Q)$, and one
obtains

$$g_\alpha(z) = \int_{z^{1/\alpha}}^1 du\, \ln\left( \frac{z}{u^\alpha} \right) = \ln z + \alpha (1 - z^{1/\alpha}). \qquad (46)$$

This is a negative monotonically increasing function which vanishes quadratically in
$1 - z$ near $z = 1$,

$$g_\alpha(z) \approx -\frac{1}{2\alpha} (1 - z)^2 = -\frac{1}{2\alpha} (1 - Q)^2 \quad \text{for } z, Q \to 1. \qquad (47)$$
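The closed form (46) and its quadratic approximation (47) can be checked against a direct numerical evaluation of the defining integral in (33) for the uniform distribution. The following sketch is illustrative only; the function names, the grid size and the sample values of z and $\alpha$ are assumptions.

```python
import math


def g_numeric(z, alpha, steps=20000):
    """Midpoint-rule estimate of g_alpha(z) = int_0^1 du ln Q(z / u**alpha)
    for the uniform distribution, Q(x) = x for 0 < x < 1 and Q(x) = 1 above."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        x = z / u ** alpha
        if x < 1.0:                 # ln Q vanishes where Q = 1
            total += math.log(x)
    return total * h


def g_exact(z, alpha):
    """Closed form (46)."""
    return math.log(z) + alpha * (1.0 - z ** (1.0 / alpha))


def g_quadratic(z, alpha):
    """Leading behaviour (47) near z = 1."""
    return -(1.0 - z) ** 2 / (2.0 * alpha)


for z in (0.5, 0.9, 0.99):
    print(z, g_numeric(z, 2.0), g_exact(z, 2.0), g_quadratic(z, 2.0))
```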

The evaluation of the record rate (32) then yields

$$P_n \approx \int_0^1 dQ\, \exp[-n (1 - Q)^2 / 2\alpha] \approx \sqrt{\frac{\pi \alpha}{2n}} \qquad (48)$$

for large n, and the number of records grows asymptotically as $\sqrt{n}$. The specific power
is clearly related to the quadratic behaviour of $g_\alpha$ near $z = 1$, which in turn reflects the
behaviour of $Q(X)$ near the upper boundary $X = 1$. More generally we may consider
bounded distributions of the form

$$Q(X) = 1 - (1 - X)^\nu, \quad 0 < X < 1, \qquad (49)$$

with $\nu > 0$ and the uniform case corresponding to $\nu = 1$. To extract the leading order
behaviour of $g_\alpha$ for $z \to 1$ we write

$$\int_{z^{1/\alpha}}^1 du\, (1 - z/u^\alpha)^\nu = \frac{z^{1/\alpha}}{\alpha} \int_z^1 dv\, v^{-(1 + 1/\alpha)} (1 - v)^\nu \approx \frac{1}{\alpha (\nu + 1)} (1 - z)^{\nu + 1} \qquad (50)$$


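for $z \to 1$. Hence the record rate decays as $n^{-\nu/(\nu+1)}$ and the mean number of records
grows as

$$\langle R_n \rangle \approx (\nu + 1)^{1 + \nu/(\nu+1)}\, \alpha^{\nu/(\nu+1)}\, \Gamma\!\left( 1 + \frac{\nu}{\nu + 1} \right) n^{1/(\nu+1)}. \qquad (51)$$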




Figure 1. Simulation results for the mean number of records for distributions of 
Gumbel type. Full lines show data obtained for the exponential distribution with 
$\alpha = 2$, $\alpha = 1$ and $\alpha = 1/2$. The dashed line shows data obtained for the Gaussian
distribution and $\alpha = 1$. The thin dotted line is the harmonic series (2) which applies
universally for $\alpha = 0$. The short bold dotted lines show the predicted slope $\alpha/2$ for
the exponential case and $\alpha$ in the Gaussian case. All data were obtained from 10'*
realizations of time series of length 10^. 



3.3. Simulations 

The asymptotic laws (37), (43), (51) were first discovered in simulations, and they have
subsequently been numerically verified for a variety of parameter values. As an example,
we show in Figure 1 numerical data for the mean number of records obtained for
distributions in the Gumbel class. There are significant corrections to the asymptotic
behaviour for the Gaussian distribution as well as for the exponential distribution with
$\alpha = 2$. This is not surprising in view of the approximations used in the derivation of
(43): for example, the last step in (42) requires that $\ln n \gg \ln(\ln n)$, which is true only
for enormously large values of n.

Simulations have also been used to investigate the occurrence of correlations in the
record time process for $\alpha > 0$. We have seen in section 2.1 that the Poisson statistics of
$R_n$ is a consequence of the fact that the record indicator variables $I_n$ are independent in
the i.i.d. case. In particular, (14) shows that the variance of $R_n$ is asymptotically equal
to the mean whenever the $I_n$ are uncorrelated and the record rate $P_n$ tends to zero for
$n \to \infty$ in such a way that $\langle R_n \rangle$ diverges. As this is true for $\alpha > 0$ in all cases that we
have considered, the index of dispersion (15) can be used as a probe for correlations.
The data displayed in Figure 2 clearly show that the asymptotic value of $\rho_n$ is less
than unity and independent of $\alpha$ for the uniform distribution. Similar results have been
obtained for the exponential distribution, whereas we find that $\rho_n \to 1$ for the power
law case. We conclude that, at least in certain cases, the record time process becomes




Figure 2. Simulation results for the ratio of the variance of the number of records to
the mean obtained using the uniform distribution with $\alpha = 0$, 1/2, 1 and 2. While the
data for $\alpha = 0$ approach the asymptotic Poisson limit of unity according to (14), the
data for $\alpha > 0$ converge to a universal sub-Poissonian value. The data were obtained
from 10^ realizations of time series of length 10^. 



more regular than the log-Poisson process when the underlying distribution broadens 
with time. 
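A simulation of the kind described in this section can be sketched in a few lines. The code below is not the program used for the figures; it is a minimal illustration for the uniform distribution, with hypothetical function names and with the sequence length, number of realizations and seed chosen only to keep the run time short. For $\alpha = 1$ the sample mean can be compared with $\sqrt{2\pi\alpha n}$, which is what summing the record rate $\sqrt{\pi\alpha/2n}$ of (48) over time gives to leading order.

```python
import random


def simulate_record_counts(n, alpha, trials, rng):
    """Number of upper records in `trials` sequences X_k = k**alpha * U_k.

    U_k is uniform on (0, 1), so the width of the distribution of X_k grows
    as k**alpha (lambda_k = k**(-alpha) in eq. (5), with lambda_0 = 1).
    """
    widths = [k ** alpha for k in range(1, n + 1)]
    counts = []
    for _ in range(trials):
        best = float("-inf")
        count = 0
        for width in widths:
            x = width * rng.random()
            if x > best:
                best = x
                count += 1
        counts.append(count)
    return counts


def mean_and_dispersion(counts):
    """Sample mean and index of dispersion (variance / mean) of the counts."""
    mean = sum(counts) / len(counts)
    variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return mean, variance / mean


rng = random.Random(2024)       # arbitrary fixed seed
for alpha in (0.0, 1.0):
    counts = simulate_record_counts(1000, alpha, 400, rng)
    mean, rho = mean_and_dispersion(counts)
    print(f"alpha = {alpha}: mean = {mean:.2f}, index of dispersion = {rho:.3f}")
```

With only a few hundred realizations the estimate of the index of dispersion is noisy, so considerably larger samples and longer sequences would be needed to resolve its asymptotic value.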

4. Summary and discussion 

The main results of this paper are the asymptotic laws (37), (43), (51) for the mean number
of records in sequences of random variables drawn from broadening distributions. In
all three cases the exponent $\alpha$ governing the time dependence of the width of the
distribution enters only in the prefactors and does not affect the functional form of
the result. Comparing the three cases, we see that the effect of the broadening on
$\langle R_n \rangle$ is stronger the faster the underlying distribution $\Pi(X)$ decays for large arguments:
For fat-tailed power law distributions the number of records remains logarithmic, for
exponential-like distributions it changes from $\ln n$ to $(\ln n)^2$, while for distributions with
bounded support the logarithm speeds up to a power law in time.

Apart from the presentation of new results, a secondary purpose of this paper 
has been to advertise record dynamics as a paradigm of non-stationary point processes 
with interesting mathematical properties and wide-spread applications ranging from 
fundamental issues in the dynamics of complex systems to the consequences of climatic 
change. In the present work we have combined the intrinsic non-stationarity of record 
dynamics with an explicit non-stationarity of the underlying sequence of random 
variables. This turns out to be a relevant modification which may alter the basic 
logarithmic time-dependence of the mean number of records, and it can induce 
correlations among the record times, as detected in deviations of the index of dispersion 
(15) from unity. It is worth noting that evidence for such correlations can also be found
in recent applications of record dynamics in simulations of complex systems [17], [18].
An analytic understanding of the origin of correlations in the models presented here is 
clearly an important goal for the near future. 